6 research outputs found

    All-rounder: A flexible DNN accelerator with diverse data format support

    Full text link
    Recognizing the explosive growth of DNN-based applications, several industrial companies have developed custom ASICs (e.g., Google TPU, IBM RaPiD, Intel NNP-I/NNP-T) and built hyperscale cloud infrastructure around them. These ASICs perform the inference or training of DNN models requested by users. Since DNN models use different data formats and types of operations, an ASIC needs to support diverse data formats and be general enough to cover the operations. Conventional ASICs, however, do not fulfill these requirements. To overcome these limitations, we propose a flexible DNN accelerator called All-rounder. The accelerator is designed with an area-efficient multiplier that supports multiple precisions of integer and floating-point datatypes. In addition, it comprises a flexibly fusible and fissionable MAC array to support various types of DNN operations efficiently. We implemented the register transfer level (RTL) design in Verilog and synthesized it in 28 nm CMOS technology. To examine the practical effectiveness of our proposed designs, we implemented two multiply units and three state-of-the-art DNN accelerators as baselines. We compare our multiplier with these multiply units and perform an architectural evaluation of performance and energy efficiency on eight real-world DNN models. Furthermore, we compare the benefits of the All-rounder accelerator against a high-end GPU card, i.e., an NVIDIA GeForce RTX 3090. The proposed All-rounder accelerator consistently achieves higher speedup and energy efficiency than the baselines across the DNN benchmarks.
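
    The sketch below illustrates the general decomposition trick behind precision-scalable multipliers, where one wide multiply is assembled from several narrow partial products that can alternatively serve as independent low-precision multiplies. It is a minimal Python illustration of the concept only, not All-rounder's actual circuit (which additionally covers floating-point formats).

        # Illustrative sketch: compose a 16x16-bit unsigned multiply from four
        # 8x8-bit partial products. Precision-scalable multipliers exploit this
        # structure; in a low-precision mode the same four 8-bit multipliers
        # can instead produce four independent INT8 products per cycle.
        def mul16_from_mul8(a: int, b: int) -> int:
            a_hi, a_lo = a >> 8, a & 0xFF
            b_hi, b_lo = b >> 8, b & 0xFF
            # Shift each 8x8 partial product into place and sum.
            return ((a_hi * b_hi) << 16) + ((a_hi * b_lo) << 8) \
                 + ((a_lo * b_hi) << 8) + (a_lo * b_lo)

        assert mul16_from_mul8(0xBEEF, 0xCAFE) == 0xBEEF * 0xCAFE

    The design choice this captures is reuse: the same narrow multiplier array covers both wide-precision and high-throughput low-precision modes instead of dedicating separate hardware to each.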

    Deep Partitioned Training from Near-Storage Computing to DNN Accelerators

    No full text
    In this paper, we present deep partitioned training to accelerate the computations involved in training DNN models. This is the first work to partition a DNN model across storage devices, an NPU, and a host CPU, forming a unified compute node for training workloads. To validate the benefit of the proposed system during DNN training, a trace-based simulator or an FPGA prototype is used to estimate overall performance and to obtain the layer index at which partitioning yields the minimum latency. As a case study, we select two benchmarks: vision-related tasks and a recommendation system. As a result, training time is reduced by 12.2-31.0% with four near-storage computing devices in the vision-related tasks with a mini-batch size of 512, and by 40.6-44.7% with one near-storage computing device in the selected recommendation system with a mini-batch size of 64.
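
    To make the partition-point search concrete, here is a minimal Python sketch under a simple additive latency model: per-layer latencies on the near-storage devices and on the NPU, plus a transfer cost for the activations handed over at the split. The model and all cost numbers are illustrative assumptions, not the paper's simulator or measurements.

        # Pick the split index minimizing estimated end-to-end latency.
        # nsc_time[i] / npu_time[i]: assumed latency of layer i on the
        # near-storage device / NPU; xfer_time[i]: assumed cost of moving
        # layer i's output activations across the partition boundary.
        def best_partition(nsc_time, npu_time, xfer_time):
            n = len(nsc_time)
            candidates = []
            for split in range(n + 1):  # layers [0, split) run near storage
                latency = (sum(nsc_time[:split])
                           + (xfer_time[split - 1] if split else 0)
                           + sum(npu_time[split:]))
                candidates.append((latency, split))
            return min(candidates)

        # Made-up per-layer costs (ms): early layers cheap near storage.
        lat, idx = best_partition([1, 1, 5], [3, 3, 3], [1, 1, 1])
        print(f"first {idx} layers near storage; estimated latency {lat} ms")

    With these assumed costs the search keeps the two cheap early layers near storage and hands the rest to the NPU, which is the qualitative behavior the paper's partitioning targets.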

    SEMS: Scalable Embedding Memory System for Accelerating Embedding-Based DNNs

    No full text
    Embedding layers, which are widely used in various deep learning (DL) applications, are very large and continue to grow. We propose the scalable embedding memory system (SEMS) to handle inference for DL applications with large embedding layers. SEMS is built from scalable embedding memory (SEM) modules, which include an FPGA for acceleration. In SEMS, the scalable and versatile PCIe bus is used to expand system memory, and processing within the SEMs reduces the amount of data transferred from the SEMs to the host, improving the effective bandwidth of PCIe. To achieve better performance, we apply various optimization techniques at different levels. We also develop SEMlib, a Python library that makes SEMS convenient to use. We implement a proof-of-concept prototype of SEMS; using it yields DLRM execution that is 32.85x faster than a CPU-based system when there is not enough DRAM to hold the entire embedding layer.
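
    A back-of-the-envelope sketch of the effective-bandwidth argument: pooling embedding rows inside a SEM module means one reduced vector crosses PCIe per lookup instead of every gathered row. The row size and lookup count below are illustrative assumptions, not measurements or parameters from the paper.

        # Bytes crossing PCIe per pooled embedding lookup, with and without
        # near-memory reduction. Sizes are assumed for illustration.
        ROW_BYTES = 64 * 4        # one embedding row: 64 float32 features
        LOOKUPS_PER_SAMPLE = 80   # rows gathered per pooled lookup

        naive_bytes = LOOKUPS_PER_SAMPLE * ROW_BYTES  # ship every row to host
        pooled_bytes = ROW_BYTES                      # ship only pooled result

        print(f"host-side pooling: {naive_bytes} B over PCIe per lookup")
        print(f"SEM-side pooling : {pooled_bytes} B over PCIe per lookup "
              f"({naive_bytes // pooled_bytes}x less traffic)")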

    Shape control of nanostructured cone-shaped particles by tuning the blend morphology of A-b-B diblock copolymers and C-type copolymers within emulsion droplets

    No full text
    Block copolymers (BCPs) under colloidal confinement provide an effective route to producing nonspherical particles. However, the resulting structures are typically limited to spheroids, and it remains challenging to achieve a higher level of control over particle shape with different symmetries. Herein, we exploit blends of BCPs and statistical copolymers (sCPs) within emulsion droplets to develop a series of particles with different symmetries (i.e., Janus-sphere and cone-shaped particles). The particle shape is tunable by controlling the phase behavior of a polymer blend consisting of a poly(styrene-block-1,4-butadiene) (PS-b-PB) BCP and a poly(methyl methacrylate-stat-(4-acryloylbenzophenone)) (P(MMA-stat-4ABP)) sCP. A key strategy for controlling the phase separation of the polymer blend is to systematically tune the incompatibility between the BCP and the sCP by varying the composition of the sCPs (the mole fraction of 4ABP). As a result, a sequential morphological transition from a prolate ellipsoid, to a Janus-sphere, to a cone-shaped particle is observed as the 4ABP mole fraction increases. We further demonstrate that the shape anisotropy of the cone-shaped particles can be tailored by controlling the particle size and the Janusity, supported by quantitative calculation of the particle shape anisotropy from a theoretical model. The importance of shape control for cone-shaped particles of high uniformity in a batch is also demonstrated by investigating their coating properties, in which the deposited coating pattern is a strong function of the shape anisotropy of the particles.